AITopics | disentanglement loss

Collaborating Authors

disentanglement loss

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

97108695bd93b6be52fa0334874c8722-AuthorFeedback.pdf

Neural Information Processing SystemsOct-3-2025, 06:38:04 GMT

atom, disentanglement loss, experiment, (16 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.34)

Add feedback

MCM: Multi-layer Concept Map for Efficient Concept Learning from Masked Images

Sun, Yuwei, Mi, Lu, Fujisawa, Ippei, Kanai, Ryota

arXiv.org Artificial IntelligenceJan-31-2025

Masking strategies commonly employed in natural language processing are still underexplored in vision tasks such as concept learning, where conventional methods typically rely on full images. However, using masked images diversifies perceptual inputs, potentially offering significant advantages in concept learning with large-scale Transformer models. To this end, we propose Multi-layer Concept Map (MCM), the first work to devise an efficient concept learning method based on masked images. In particular, we introduce an asymmetric concept learning architecture by establishing correlations between different encoder and decoder layers, updating concept tokens using backward gradients from reconstruction tasks. The learned concept tokens at various levels of granularity help either reconstruct the masked image patches by filling in gaps or guide the reconstruction results in a direction that reflects specific concepts. Moreover, we present both quantitative and qualitative results across a wide range of metrics, demonstrating that MCM significantly reduces computational costs by training on fewer than 75% of the total image patches while enhancing concept prediction performance. Additionally, editing specific concept tokens in the latent space enables targeted image generation from masked images, aligning both the visible contextual patches and the provided concepts. By further adjusting the testing time mask ratio, we could produce a range of reconstructions that blend the visible patches with the provided concepts, proportional to the chosen ratios.

artificial intelligence, concept token, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2502.00266

Country: Asia > China > Guangxi Province > Nanning (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.61)

Add feedback

An Effective Mixture-Of-Experts Approach For Code-Switching Speech Recognition Leveraging Encoder Disentanglement

Yang, Tzu-Ting, Wang, Hsin-Wei, Wang, Yi-Cheng, Lin, Chi-Han, Chen, Berlin

arXiv.org Artificial IntelligenceFeb-26-2024

With the massive developments of end-to-end (E2E) neural networks, recent years have witnessed unprecedented breakthroughs in automatic speech recognition (ASR). However, the codeswitching phenomenon remains a major obstacle that hinders ASR from perfection, as the lack of labeled data and the variations between languages often lead to degradation of ASR performance. In this paper, we focus exclusively on improving the acoustic encoder of E2E ASR to tackle the challenge caused by the codeswitching phenomenon. Our main contributions are threefold: First, we introduce a novel disentanglement loss to enable the lower-layer of the encoder to capture inter-lingual acoustic information while mitigating linguistic confusion at the higher-layer of the encoder. Second, through comprehensive experiments, we verify that our proposed method outperforms the prior-art methods using pretrained dual-encoders, meanwhile having access only to the codeswitching corpus and consuming half of the parameterization. Third, the apparent differentiation of the encoders' output features also corroborates the complementarity between the disentanglement loss and the mixture-of-experts (MoE) architecture.

encoder, proceedings, speech recognition, (17 more...)

arXiv.org Artificial Intelligence

2402.17189

Country:

Asia > Taiwan > Taiwan Province > Taipei (0.04)
Asia > Malaysia (0.04)
Asia > East Asia (0.04)
Asia > China (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Arbitrary Style Transfer via Multi-Adaptation Network

Deng, Yingying, Tang, Fan, Dong, Weiming, Sun, Wen, Huang, Feiyue, Xu, Changsheng

arXiv.org Artificial IntelligenceAug-16-2020

Arbitrary style transfer is a significant topic with research value and application prospect. A desired style transfer, given a content image and referenced style painting, would render the content image with the color tone and vivid stroke patterns of the style painting while synchronously maintaining the detailed content structure information. Style transfer approaches would initially learn content and style representations of the content and style references and then generate the stylized images guided by these representations. In this paper, we propose the multi-adaptation network which involves two self-adaptation (SA) modules and one co-adaptation (CA) module: the SA modules adaptively disentangle the content and style representations, i.e., content SA module uses position-wise self-attention to enhance content representation and style SA module uses channel-wise self-attention to enhance style representation; the CA module rearranges the distribution of style representation based on content representation distribution by calculating the local similarity between the disentangled content and style features in a non-local fashion. Moreover, a new disentanglement loss function enables our network to extract main style patterns and exact content structures to adapt to various input images, respectively. Various qualitative and quantitative experiments demonstrate that the proposed multi-adaptation network leads to better results than the state-of-the-art style transfer methods.

artificial intelligence, content image, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2005.13219

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Synthesized Policies for Transfer and Adaptation across Tasks and Environments

Hu, Hexiang, Chen, Liyu, Gong, Boqing, Sha, Fei

arXiv.org Machine LearningApr-5-2019

The ability to transfer in reinforcement learning is key towards building an agent of general artificial intelligence. In this paper, we consider the problem of learning to simultaneously transfer across both environments (ENV) and tasks (TASK), probably more importantly, by learning from only sparse (ENV, TASK) pairs out of all the possible combinations. We propose a novel compositional neural network architecture which depicts a meta rule for composing policies from the environment and task embeddings. Notably, one of the main challenges is to learn the embeddings jointly with the meta rule. We further propose new training methods to disentangle the embeddings, making them both distinctive signatures of the environments and tasks and effective building blocks for composing the policies. Experiments on GridWorld and Thor, of which the agent takes as input an egocentric view, show that our approach gives rise to high success rates on all the (ENV, TASK) pairs after learning from only 40\% of them.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Machine Learning

1904.03276

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback